DRAFT Automatic Creation of Lexical Knowledge Bases: New Developments in Computational
Abstract
Text processing technologies require increasing amounts of information about words and phrases to cope with the massive amounts of textual material available today. Information retrieval search engines provide ever greater coverage, but they do not provide a capability for identifying the specific content that is sought. Greater reliance is placed on natural language processing (NLP) technologies, which in turn rely increasingly on semantic as well as syntactic information about lexical items. The structure and content of lexical entries have been growing rapidly to meet these needs, but obtaining the necessary information for these lexical knowledge bases (LKBs) is a major problem. Computational lexicology, which began with somewhat halting attempts to extract lexical information from machine-readable dictionaries (MRDs) for use in NLP, is seeing the emergence of new techniques that offer considerable promise for populating and organizing LKBs. Many of these techniques involve computations within the LKBs themselves to create, propagate, and organize the lexical information.
1 Introduction
Computational lexicology began in the late 1960s and 1970s with attempts to extract lexical information from machine-readable dictionaries (MRDs) for use in natural language processing (NLP), primarily by extracting hierarchies of verbs and nouns. During the 1980s, NLP began reaching beyond syntactic information toward a greater reliance on semantic information, locating this information within the lexicon. Once it was concluded in the early 1990s that insufficient information about lexical items could be obtained from MRDs, new techniques emerged that offer considerable promise for populating and organizing lexical knowledge bases (LKBs).
An underlying reason these techniques have become feasible seems to be the growing capacity to process the large amounts of data that must be digested to handle the overall content and complexity of semantics. This discussion begins with the assumptions about large amounts of information in lexical entries and the particular computations made with this information in NLP. From this starting point, the paper describes emerging techniques for populating lexical entries and propagating information to them, derived from existing information within the LKB. The primary motivations for extending lexical entries come from a need to provide greater internal
Similar resources
A Tool For The Automatic Creation, Extension And Updating Of Lexical Knowledge Bases
A tool is described which helps in the creation, extension and updating of lexical knowledge bases (LKBs). Two levels of representation are distinguished: a static storage level and a dynamic knowledge level. The latter is an object-oriented environment containing linguistic and lexicographic knowledge. At the knowledge level, constructors and filters can be defined. Constructors are objects wh...
Automatic Thesaurus Generation from Raw Text using Knowledge-Poor Techniques
In addition to showing how lexical units are related within a field, domain-specific thesauri give an idea of what subjects are important to that field and are thus useful at many points in an information system. The major impediment to creation of thesauri has been the cost of their manual creation. We present here a number of automatic techniques that jointly produce a first draft of a thesaurus fro...
A Survey on Portuguese Lexical Knowledge Bases: Contents, Comparison and Combination
In the last decade, several lexical-semantic knowledge bases (LKBs) were developed for Portuguese, by different teams and following different approaches. Most of them are open and freely available for the community. Those LKBs are briefly analysed here, with a focus on size, structure, and overlapping contents. However, we go further and exploit all of the analysed LKBs in the creation of new L...
Acquiring Semantic Information in the TCL’s Computational Lexicon
Ontologies are the central component for the Semantic Web, since they can be used to explicitly represent the semantics of structured or semi-structured information. In this paper, we describe the recent developments of a lexical ontology named the TCL’s computational lexicon, which aims to serve as the core knowledge base for the Semantic Web. We focus on designing a new specification of the s...
A Constraint-Based Approach for Computational Lexicon Construction
Ontologies are the central component for the Semantic Web, since they can be used to explicitly represent the semantics of structured or semistructured information. In this paper, we describe the recent developments of a lexical ontology named the TCL's computational lexicon, which aims to serve as the core knowledge base for the Semantic Web. We focus on designing a new specification of the se...